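Markerless motion capture has become an active research area in computer vision in recent years. It has applications in a wide variety of fields, including computer animation, human motion analysis, biomedical research, virtual reality, and sports science. Human pose estimation has recently attracted growing attention in the computer vision community, but it remains a challenging task because of depth uncertainty and the lack of synthetic datasets. Various approaches have recently been proposed to address this problem, many of them based on deep learning. They focus mainly on improving performance on existing benchmarks and have made significant progress, especially on 2D images. Building on powerful deep learning techniques and recently collected real-world datasets, we explore a model that can predict an animated skeleton from 2D images alone. The frames are generated from different real-world datasets with different body shapes, ranging from simple to complex. The implementation uses DeepLabCut on our own dataset to perform many of the necessary steps and then trains the model on the input frames. The output is an animated skeleton of the human motion. The composite dataset and the other results serve as the "ground truth" for the deep model.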
Air pollution is a growing problem that needs to be addressed in both developed and developing countries. In Vietnam, air pollution is a pressing concern in big cities such as Hanoi and Ho Chi Minh City, where it comes mostly from vehicles such as cars and motorbikes. To tackle the problem, this paper develops a solution that estimates emitted PM2.5 pollutants by counting the number of vehicles in traffic. We first surveyed recent object detection models and developed our own traffic surveillance system. The observed traffic density showed a trend similar to that of the measured PM2.5, with a certain time lag, suggesting a relation between traffic density and PM2.5. We further express this relationship with a mathematical model that estimates the PM2.5 value from the observed traffic density. The estimated results correlated strongly with the measured PM2.5 in the urban context.
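The lagged relation between traffic density and PM2.5 described above can be illustrated with a simple lagged-correlation scan. This is a minimal sketch, not the paper's mathematical model: the data are made up, and the function names `pearson` and `best_lag` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_lag(traffic, pm25, max_lag):
    """Return (lag, r) maximizing the correlation of traffic[t] with pm25[t + lag]."""
    best = (0, float("-inf"))
    for lag in range(max_lag + 1):
        x = traffic[: len(traffic) - lag]  # traffic counts at time t
        y = pm25[lag:]                     # PM2.5 readings at time t + lag
        r = pearson(x, y)
        if r > best[1]:
            best = (lag, r)
    return best

# Hypothetical data: PM2.5 is a scaled copy of traffic delayed by 2 time steps.
traffic = [10, 30, 80, 120, 90, 40, 20, 15, 60, 110, 70, 25]
pm25 = [20.0, 20.0] + [20.0 + 0.3 * v for v in traffic[:-2]]
lag, r = best_lag(traffic, pm25, max_lag=4)
print(lag, round(r, 3))
```

On real measurements the peak correlation would be well below 1, and the lag with the highest correlation is only a rough indication of the delay between traffic and measured PM2.5.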
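Fruit flies are among the insect species most harmful to fruit yield. In AlertTrap, implementations of the SSD architecture with different state-of-the-art backbone feature extractors, such as MobileNetV1 and MobileNetV2, appear to be potential solutions to the real-time detection problem. SSD-MobileNetV1 and SSD-MobileNetV2 perform well, achieving AP@0.5 of 0.957 and 1.0, respectively. YOLOv4-tiny outperforms the SSD family with an AP@0.5 of 1.0; however, its throughput is slightly lower.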
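For readers unfamiliar with the AP@0.5 metric quoted above, the following is a minimal sketch of how average precision at an IoU threshold of 0.5 can be computed. It assumes a single image and a single class, with boxes given as (x1, y1, x2, y2), and uses all-point interpolation. The abstract does not state the exact evaluation protocol, so this is an illustration rather than the authors' procedure, and the sample boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def average_precision(detections, truths, thr=0.5):
    """AP for one image and one class.

    detections: list of (score, box); truths: list of ground-truth boxes.
    A detection counts as a true positive if it overlaps a not-yet-matched
    ground-truth box with IoU >= thr.
    """
    matched = set()
    hits = []
    for score, box in sorted(detections, key=lambda d: -d[0]):
        best_j, best_iou = None, thr
        for j, gt in enumerate(truths):
            overlap = iou(box, gt)
            if j not in matched and overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is not None:
            matched.add(best_j)
        hits.append(best_j is not None)
    # Precision and recall after each detection, in descending score order
    precisions, recalls, tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / len(truths))
    # Area under the interpolated precision-recall curve
    ap, prev_recall = 0.0, 0.0
    for k in range(len(hits)):
        ap += (recalls[k] - prev_recall) * max(precisions[k:])
        prev_recall = recalls[k]
    return ap

# Hypothetical example: two ground-truth boxes, three detections (one spurious).
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(0.9, (1, 1, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 31))]
ap = average_precision(detections, truths)
print(round(ap, 4))
```

An AP@0.5 of 1.0 means that every ground-truth object is found and no false positive is ranked above a true positive.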
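In this paper, we study statistical inference for the Wasserstein distance, which has attracted much attention and has been applied to various machine learning tasks. Several inference methods have been proposed in the literature, but almost all of them rely on asymptotic approximation and lack finite-sample validity. In this study, we propose an exact (non-asymptotic) inference method for the Wasserstein distance, inspired by the concept of conditional selective inference (SI). To our knowledge, this is the first method that can provide a valid confidence interval (CI) for the Wasserstein distance with a finite-sample coverage guarantee, and it can be applied not only to one-dimensional problems but also to multi-dimensional problems. We evaluate the performance of the proposed method on both synthetic and real-world datasets.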
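As background for the quantity being inferred, the sketch below computes the 1-Wasserstein distance between two empirical samples in the simplest setting: one dimension, equal sample sizes, uniform weights. In that case the optimal coupling pairs the sorted values. It produces only the point estimate, not the selective-inference confidence interval proposed in the paper; the function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size samples on the real line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal transport plan matches order statistics.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Shifting a sample by a constant moves every point by that constant.
d = wasserstein_1d([0.0, 1.0, 3.0], [5.0, 6.0, 8.0])
print(d)
```

In the multi-dimensional case the distance is typically obtained by solving a linear program over couplings, which is beyond this sketch.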
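In practical data analysis under noisy environments, it is common to first use robust methods to identify outliers and then to conduct further analysis after removing them. In this paper, we consider statistical inference for the model estimated after outlier removal, which can be interpreted as a selective inference (SI) problem. To use the conditional SI framework, it is necessary to characterize the event of how the robust method identifies outliers. Unfortunately, existing methods cannot be used directly here because they apply only to cases where the selection event can be represented by linear or quadratic constraints. In this paper, we propose a conditional SI method for popular robust regressions by using a homotopy method. We show that the proposed conditional SI method is applicable to a wide class of robust regression and outlier detection methods and has good empirical performance in experiments on both synthetic and real data.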
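Although a vast body of literature addresses image segmentation with deep neural networks (DNNs), less attention has been paid to assessing the statistical reliability of segmentation results. In this study, we interpret segmentation results as hypotheses driven by a DNN (called DNN-driven hypotheses) and propose a method to quantify the reliability of these hypotheses within a statistical hypothesis testing framework. Specifically, we consider a statistical hypothesis test for the difference between the object and background regions. This problem is challenging because the observed difference is biased by the DNN's adaptation to the data. To overcome this difficulty, we introduce a conditional selective inference (SI) framework, a new statistical inference framework for data-driven hypotheses that has recently received considerable attention, to compute exact (non-asymptotic) valid p-values for the segmentation results. To use the conditional SI framework for DNN-based segmentation, we develop a new SI algorithm based on the homotopy method, which enables us to derive the exact (non-asymptotic) sampling distribution of the DNN-driven hypothesis. We conduct experiments on both synthetic and real-world datasets, through which we offer evidence that the proposed method can successfully control the false positive rate, performs well in terms of computational efficiency, and provides good results when applied to medical image data.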
We propose a new causal inference framework to learn causal effects from multiple, decentralized data sources in a federated setting. We introduce an adaptive transfer algorithm that learns the similarities among the data sources by utilizing Random Fourier Features to disentangle the loss function into multiple components, each of which is associated with a data source. The data sources may have different distributions; the causal effects are independently and systematically incorporated. The proposed method estimates the similarities among the sources through transfer coefficients and hence requires no prior information about the similarity measures. The heterogeneous causal effects can be estimated without sharing the raw training data among the sources, thus minimizing the risk of privacy leakage. We also provide minimax lower bounds to assess the quality of the parameters learned from the disparate sources. The proposed method is empirically shown to outperform the baselines on decentralized data sources with dissimilar distributions.
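The Random Fourier Features mentioned above can be illustrated with the generic construction for approximating an RBF kernel. This is a minimal sketch of that standard technique only: the abstract does not describe how the features are used to disentangle the loss, so nothing here reflects the authors' algorithm, and the helper `make_rff` and its parameters are hypothetical.

```python
import math
import random

def make_rff(dim, n_features, gamma, seed=0):
    """Build a feature map z with dot(z(x), z(y)) ~ exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # The RBF kernel exp(-gamma * ||d||^2) has spectral density N(0, 2 * gamma * I).
    std = math.sqrt(2.0 * gamma)
    ws = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def z(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(ws, bs)]

    return z

# Compare the random-feature inner product with the exact kernel value.
z = make_rff(dim=2, n_features=5000, gamma=0.5)
x, y = [0.5, -1.0], [1.0, 0.0]
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))
print(approx, exact)
```

Because the kernel becomes an inner product of explicit finite-dimensional features, quantities built from it can be written as sums over those features, which is what makes such maps convenient when computation is split across sources.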
Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System, which classifies DR grades, localizes lesion areas, and provides visual explanations; and (ii) DRG-Expert-Interaction, which receives feedback from expert users and improves the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. In addition, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss function constraints between lesion features and classification features, our approach remains robust to a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art (SOTA) deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly-supervised manner.
This study proposes an approach for establishing an optimal multihop ad-hoc network using multiple unmanned aerial vehicles (UAVs) to provide emergency communication in disaster areas. The approach has two stages: the first uses particle swarm optimization (PSO) to find optimal positions at which to deploy the UAVs, and the second uses a behavior-based controller to navigate the UAVs to their assigned positions without colliding with obstacles in an unknown environment. Several constraints related to the UAVs' sensing and communication ranges have been imposed to ensure that the proposed approach is applicable in real-world scenarios. A number of simulation experiments with data loaded from real environments have been conducted. The results show that the proposed approach not only succeeds in establishing multihop ad-hoc routes but also meets the requirements for real-time deployment of UAVs.
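The first stage above relies on particle swarm optimization. The following is a minimal sketch of a generic PSO loop, applied to a toy placement problem: positioning one relay so that the sum of squared distances to three ground nodes is smallest. The objective, the node coordinates, and the parameter values are assumptions chosen for illustration; the paper's actual objective and its sensing and communication constraints are not given in the abstract.

```python
import random

def pso(objective, bounds, n_particles=30, n_iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Hypothetical ground nodes; the minimizer of this cost is their centroid (5, 3).
nodes = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]

def cost(p):
    return sum((p[0] - nx) ** 2 + (p[1] - ny) ** 2 for nx, ny in nodes)

best, best_val = pso(cost, bounds=[(0.0, 10.0), (0.0, 10.0)])
print([round(v, 2) for v in best], round(best_val, 2))
```

In a multi-UAV setting each particle would encode the positions of all UAVs at once, and range constraints would typically enter as penalties or feasibility checks in the objective.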
We propose combining three pre-trained language models (XLM-R, BART, and DeBERTa-V3) to strengthen the contextualized embeddings used for named entity recognition. Our model achieves a 92.9% F1 score on the test set and ranks 5th on the leaderboard of NL4Opt competition subtask 1.